Semantic similarity and the generation of referring expressions: A first report

Authors

  • Albert Gatt
  • Kees van Deemter
Abstract

The past decade has witnessed renewed interest in the Generation of Referring Expressions (GRE) [23, 24, 8, 9, 10, 12, 22]. Broadening the scope beyond earlier work [3, 4, 5], recent proposals involve algorithms that refer to sets as well as individuals, using operations such as set union (‘the cat and the dogs’) and complementation (‘the dog that is not black’). As a consequence, it has become more difficult for a generator to choose among alternative expressions that may be coextensive. This paper is part of a concerted effort to shed some empirical light on the question of expressive choice. The focus is on reference to sets, where a referring expression is built by unifying two or more singletons. Starting with descriptions of the form ‘the N1 and (the) N2’, we investigate whether the semantic similarity of N1 and N2 is relevant in determining the acceptability of the generated NP. Suppose that, in a given domain, an entity e1 can be referred to as either ‘the postgraduate’ or ‘the psychologist’; similarly, e2 can be referred to as either ‘the undergraduate’ or ‘the man on the first floor’. Various alternatives exist for an expression referring to {e1, e2}, e.g.: (i) ‘the postgraduate and the man on the first floor’, (ii) ‘the postgraduate and the undergraduate’, (iii) ‘the psychologist and the undergraduate’. Here, (ii) is arguably better than (i) or (iii). Intuitively, this is because the conjuncts in (ii) are more semantically similar or ‘related’. Moreover, expression (iii) violates the Gricean maxims: the choice of two equally specific [2] but semantically unrelated descriptors, ‘psychologist’ for e1 versus ‘undergraduate’ for e2, might give rise to (false) implicatures, such as that the two entities have nothing in common, thus violating the Gricean Cooperative Principle and resulting in a description which is less coherent than it might be.
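The choice among coextensive alternatives described above can be illustrated with a minimal sketch. It assumes a pairwise similarity function over descriptors; the scores in `TOY_SIMILARITY` and the names `similarity` and `best_conjunction` are hypothetical, invented for illustration, and are not the measure or the procedure used in the study.

```python
from itertools import product

# Candidate descriptors for each entity, taken from the example in the text.
DESCRIPTORS = {
    "e1": ["postgraduate", "psychologist"],
    "e2": ["undergraduate", "man on the first floor"],
}

# Hypothetical similarity scores in [0, 1]. These values are made up for
# illustration only; they are not results from the paper.
TOY_SIMILARITY = {
    frozenset(["postgraduate", "undergraduate"]): 0.9,
    frozenset(["psychologist", "undergraduate"]): 0.4,
    frozenset(["postgraduate", "man on the first floor"]): 0.2,
    frozenset(["psychologist", "man on the first floor"]): 0.2,
}


def similarity(n1, n2):
    """Look up the toy similarity of two descriptors (0.0 if unknown)."""
    return TOY_SIMILARITY.get(frozenset([n1, n2]), 0.0)


def best_conjunction(descriptors, first, second):
    """Pick the 'the N1 and the N2' form whose conjuncts are most similar."""
    candidates = product(descriptors[first], descriptors[second])
    n1, n2 = max(candidates, key=lambda pair: similarity(*pair))
    return f"the {n1} and the {n2}"


print(best_conjunction(DESCRIPTORS, "e1", "e2"))
# prints: the postgraduate and the undergraduate
```

Under these assumed scores the sketch selects alternative (ii); any real generator would need an empirically grounded similarity measure in place of the lookup table.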
Suppose further that e1 and e2, as well as a third entity e3 referred to as ‘the book’, were introduced in a discourse. Subsequent reference to a pair of these entities might be made via a coordinate construction, or some other structure. Considerations of semantic similarity may guide the choice between alternatives; in particular, referring to the set {e1, e2} using an NP conjunction is more felicitous than a similar reference to {e1, e3} (‘the psychologist and the book’). In the latter case, it may be more felicitous to refer to these two entities using different phrases. A third consideration has to do with a user’s comprehension of a generated text. If a description gave rise to false implicatures, or simply sounded odd as a result of an infelicitous choice of descriptors, the quality of the text and its comprehensibility would be reduced. We next describe a correlational study which investigated the relationship between semantic similarity and the perceived acceptability of conjoined NPs. Our study is closely related in spirit to [13], which also

Similar resources

Towards a psycholinguistically motivated algorithm for referring to sets: The role of semantic similarity

This paper explores the role of semantic similarity in content selection and aggregation of expressions referring to sets. Similarity plays a role in ensuring that a referring expression corresponds to a coherent conceptual gestalt. On the basis of corpus-based and experimental evidence we propose an algorithm which (a) separates content selection and aggregation to avoid a combinatorial explosi...

Generating Referring Expressions that Involve Gradable Properties

This article examines the role of gradable properties in referring expressions from the perspective of natural language generation. First, we propose a simple semantic analysis of vague descriptions (i.e., referring expressions that contain gradable adjectives) that reflects the context-dependent meaning of the adjectives in them. Second, we show how this type of analysis can inform algorithms f...

Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...

Generating Referring Expressions that Involve Gradable Properties

This paper examines the role of gradable properties in referring expressions, from a perspective of natural language generation. Firstly, we propose a simple semantic analysis of vague descriptions (i.e., referring expressions that contain gradable adjectives) that reflects the context-dependent meaning of the adjectives in them. Secondly, we show how this type of analysis can inform algorithms...

The D-TUNA Corpus: A Dutch Dataset for the Evaluation of Referring Expression Generation Algorithms

In this paper, we present the D-TUNA corpus, which is the first semantically annotated corpus of referring expressions in Dutch. Its primary function is to evaluate and improve the performance of REG algorithms. Such algorithms are computational models that automatically generate referring expressions by computing how a specific target can be identified to an addressee by distinguishing it from...

Publication date: 2009